THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 


VOLUME 18, NUMBER 2 OCTOBER, 1946 


Effects of Amplitude Distortion upon the Intelligibility of Speech* 


J. C. R. LICKLIDER
Psycho-Acoustic Laboratory, Harvard University, Cambridge, Massachusetts 


(Received July 1, 1946) 


DURING the last several years, the problem
of amplitude distortion in voice communi- 
cation has been under study at the Harvard 
Psycho-Acoustic Laboratory. This paper de- 
scribes briefly some of the conclusions to which 
the research has led. A more detailed account 
has been given in publications issued by the 
Psycho-Acoustic Laboratory.¹

The work to be described involved both articu- 
lation tests and judgments of quality. Several 
types of amplitude distortion were studied. The 
experiments were aimed, first, at quantifying 
the detrimental effects of amplitude distortion 
and, second, at exploring certain applications of 
non-linear circuits which promise to improve the 
efficiency of voice communication. 


TYPES OF AMPLITUDE DISTORTION 


Amplitude distortion may be defined as the 
deformation which results when a signal passes 
through a non-linear circuit. The non-linearity 
is best described with reference to the instan- 
taneous input-output characteristic of the circuit. 
The uppermost graph in Fig. 1A shows a linear 
input-output characteristic; since the output 
voltage is proportional to the input voltage, the 
output wave (at the right) resembles the input 
wave (at the bottom)—there is no amplitude 
distortion. The other graphs in Fig. 1A show 
non-linear input-output characteristics; since the 
output voltages are not proportional to the input 
voltages, the output waves are inexact repro- 
ductions of the input waves—they suffer from 
various amounts of amplitude distortion. 

There are, of course, as many varieties of 
amplitude distortion as there are curved lines. 


* This work was begun under Contract OEMsr-658 
between the Office of Scientific Research and Development 
and Harvard University, where it is continuing under 
Contract N5ori-76 with the U. S. Navy, Office of Research 
and Inventions. Report No. PNR-19. 

¹ See J. C. R. Licklider, Psycho-Acoustic Laboratory
Report OSRD No. 4217 (15 November 1944), for a fuller
treatment of most of the material described herein. (This
and other Psycho-Acoustic Laboratory reports are available
through Publication Board, U. S. Department of
Commerce, Washington, D. C.)


The type illustrated in Fig. 1A is referred to as 
symmetrical peak clipping. Other types, which 
together with symmetrical peak clipping were 
selected for study, are shown in Figs. 1B and 2. 

In symmetrical peak clipping, the peaks of
the wave on both sides of the time axis are clipped 
off, and only the center part of the wave is left. 
The severity of distortion is indicated in terms 
of the number of decibels by which the peak 
amplitude of the wave is reduced. The peak 
amplitude, in the case of speech waves, is defined 
as the average of the peak instantaneous ampli- 
tudes of the words in the speech sample under 


[Figure 1. Panels: (A) symmetrical peak clipping; (B) asymmetrical peak clipping, including half-wave rectification. Axes: instantaneous input voltage (horizontal), instantaneous output voltage (vertical).]

FIG. 1. Cathode-ray oscillograms showing instantaneous
input-output characteristics of the symmetrical peak 
clipper (A) and of the asymmetrical peak clipper (B): 
The oscillograms are to be interpreted in the same way as 
the familiar grid-plate transfer characteristics of vacuum 
tubes. The several characteristics in each set illustrate 
various amounts of clipping. Below each set of character- 
istics, an input wave is shown, and at the right of each 
characteristic, the corresponding output wave. The output 
wave may be thought of as having been determined by 
projecting each point in the input wave upward to the 
characteristic and then horizontally to the appropriate 
time position in the output plot.




[Figure 2. Panels: (A) center clipping; (B) linear rectification. Axes: instantaneous input voltage (horizontal), instantaneous output voltage (vertical).]


FIG. 2. Cathode-ray oscillograms showing instantaneous
input-output characteristics of the center clipper (A) and 
of the linear rectifier (B): The center clipper removes the 
center part of the wave, leaving the peaks. The linear 
rectifier handles the part of the speech wave above the 
time axis and the part below the time axis independently. 
It is adjustable to provide linear amplification, half-wave 
rectification, full-wave rectification, and other operations 
intermediate between these familiar ones. 


test. In asymmetrical peak clipping (Fig. 1B), 
the wave is clipped on only one side of the time 
axis. Tests were conducted also with other peak- 
clipping characteristics which were similar to 
those shown in Fig. 1 except that the overload 
was less abrupt. 
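
The peak-clipping operation described above can be sketched in a few lines. The following Python fragment is an illustration only, not the circuit used in the experiments: the function name `peak_clip`, the sampled test tone, and the idealized abrupt threshold are assumptions introduced here. The clipping level is set a given number of decibels below a reference peak amplitude, in keeping with the definition in the text.

```python
import math


def peak_clip(samples, clip_db, reference_peak, symmetrical=True):
    """Idealized abrupt peak clipper (hypothetical helper).

    The threshold lies `clip_db` decibels below `reference_peak`.
    With symmetrical=False only the positive side of the wave is clipped.
    """
    threshold = reference_peak * 10.0 ** (-clip_db / 20.0)
    clipped = []
    for x in samples:
        y = min(x, threshold)           # clip the positive peaks
        if symmetrical:
            y = max(y, -threshold)      # clip the negative peaks as well
        clipped.append(y)
    return clipped


# Two cycles of a unit-amplitude tone stand in for the speech wave.
wave = [math.sin(2.0 * math.pi * k / 32.0) for k in range(64)]

symmetric = peak_clip(wave, clip_db=20.0, reference_peak=1.0)
asymmetric = peak_clip(wave, clip_db=20.0, reference_peak=1.0, symmetrical=False)

print(max(symmetric), min(symmetric))    # both peaks held near +/-0.1
print(max(asymmetric), min(asymmetric))  # negative peak left untouched
```

In this sketch, 20 db of clipping places the threshold at one-tenth of the reference peak amplitude.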

The two forms of amplitude distortion illus- 
trated in Fig. 2 are designated as center clipping 
(2A) and linear rectification (2B). Center clipping 
eliminates the center part of the wave and passes 
only the peaks. The amount of center clipping 
is indicated in terms of the number of decibels 
by which the peak amplitude of the wave is 
reduced. Linear rectifiers handle the two parts 
of the wave on either side of the time axis as 
separate signals, amplifying or attenuating them 
independently and recombining them. Each of 
the two legs of the input-output characteristic 
of a linear rectifier is linear in itself, but the 
over-all characteristic is non-linear because the 
legs join at an angle. Each linear rectifier characteristic
is specified in terms of the angle θ at
which the variable leg meets the horizontal axis,
the other leg being fixed at 45°. Half-wave and
full-wave rectifiers, the familiar members of the
family of linear rectifiers, are represented, respectively,
by θ = 0° and by θ = −45°.
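
The family of linear rectifiers can be illustrated with a short Python sketch. It is offered under stated assumptions rather than as a description of the laboratory apparatus: the name `linear_rectify` is hypothetical, the fixed 45° leg is taken to act on positive inputs, and the variable leg is taken to act on negative inputs with slope tan θ.

```python
import math


def linear_rectify(samples, theta_deg):
    """Idealized linear rectifier (hypothetical helper).

    Positive inputs follow the fixed 45-degree leg (unit slope).
    Negative inputs follow the variable leg, whose slope is tan(theta).
    Assigning the variable leg to negative inputs is an assumption.
    """
    slope = math.tan(math.radians(theta_deg))
    return [x if x >= 0.0 else slope * x for x in samples]


wave = [0.5, 1.0, -0.5, -1.0]

linear = linear_rectify(wave, 45.0)      # both legs at 45 degrees
half_wave = linear_rectify(wave, 0.0)    # variable leg lies on the axis
full_wave = linear_rectify(wave, -45.0)  # variable leg folded upward
```

With θ = 45° the wave passes unchanged, with θ = 0° the negative half is suppressed, and with θ = −45° the negative half is inverted, which reproduces the three familiar cases named in the text.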

The distortions described above were intro- 
duced electronically into otherwise high quality 
audio communication circuits, and the intelligi- 
bility of speech transmitted over the circuits 
was determined by means of word articulation 
tests. 

Articulation testing methods have been discussed
at length elsewhere,² and the particular
procedures employed in the present experiments
have been described in detail in a previous
report.¹ It will suffice, therefore, to note here
that the speech material employed in the present 
articulation tests consisted of rather difficult 
monosyllabic words read by experienced an- 
nouncers, and that the results are expressed in 
terms of the percentage of these words heard 
correctly by experienced listeners. Fifty-percent 
word articulation may be thought of as marking 
the lower limit of satisfactory communication; 
it corresponds approximately to 90 percent 
sentence intelligibility. 


TESTS IN QUIET 


At the outset, the question of primary interest 
was: How markedly is intelligibility reduced by 
amplitude distortion? Tests were conducted, 
therefore, in quiet, with the announcers talking 
at a conversational level and with a comfortable 
signal in the listeners’ earphones. 

It was found that the reduction in intelligi- 
bility depended upon the type of distortion— 
but that, both with symmetrical peak clipping 
and with asymmetrical peak clipping, the re- 
duction was essentially nil. Even with the speech 
wave stripped down to one-tenth its original 
amplitude, 96 percent of the words were heard 
correctly, and “infinite” peak clipping, obtained
with the aid of a “flip-flop” circuit which reduced
the speech to a series of rectangular waves, left
it surprisingly intelligible. 
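
A minimal sketch of "infinite" peak clipping follows. The text does not describe the flip-flop circuit in detail, so treating it as an ideal sign detector is an assumption made here, and the helper names and sample values are hypothetical. The sketch shows only that the pattern of axis crossings survives the operation; it makes no claim about why the speech remains intelligible.

```python
def infinite_clip(samples):
    """Reduce a wave to a two-level rectangular wave (sign only)."""
    return [1.0 if x >= 0.0 else -1.0 for x in samples]


def count_zero_crossings(samples):
    """Count sign changes between successive samples."""
    signs = [x >= 0.0 for x in samples]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


# Hypothetical sample values standing in for a short stretch of speech.
wave = [0.3, 0.9, -0.2, -0.7, 0.1, 0.05, -0.6]
rectangular = infinite_clip(wave)

# The amplitude information is gone, but the crossing pattern is intact.
assert count_zero_crossings(rectangular) == count_zero_crossings(wave)
```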

Other types of distortion were more detri- 
mental than peak clipping. With the most severe 


² See James P. Egan, Psycho-Acoustic Laboratory
Report OSRD No. 3802 (1 November 1944).




form of linear rectification (full-wave rectifica- 
tion), speech was badly garbled. With extreme 
center clipping, the words sounded more like 
atmospheric static than like speech. 


TESTS IN AMBIENT NOISE 


Before considering further the extreme toler- 
ance for peak clipping which was noted in all 
tests in which the clipper acted upon a noise-free 
speech signal, it is of interest to note the results 
of tests in which both the talkers and the 
listeners were in ambient noise. Amplitude dis- 
tortion tends to be, on the whole, somewhat more 
deleterious when noise is mixed with the speech 
than it is when the speech is noise-free: intelligi- 
bility is reduced through the intermodulation 
of speech and noise. 

Figure 3 shows the effects of symmetrical peak 
clipping and center clipping upon the intelligi- 
bility of words spoken over an interphone system 
operating in uniform-spectrum noise at an over- 
all intensity of 110 db. About 80 percent of the 
words were heard correctly when there was no 
amplitude distortion. Progressively increasing 
amounts of peak clipping up to 15 db (solid 
curve) had essentially no effect upon intelligi- 
bility. There is some indication that 21-db peak 
clipping was slightly deleterious, but the striking 
feature of the curve for peak clipping is that 


[Figure 3. Curves: peak clipping (solid), center clipping (dashed). Axes: clipping in decibels (horizontal), percent word articulation (vertical).]


FIG. 3. Effects of peak clipping and of center clipping
upon communication in noise: Both the talkers and the
listeners were in uniform-spectrum noise at an intensity
of 110 db re 0.0002 dyne/cm². A close-talking microphone
was used so that, despite the intensity of the noise, the 
speech-to-noise ratio at the input terminals of the non- 
linear circuit was reasonably high. Under these conditions, 
peak clipping had very little effect upon intelligibility, 
but center clipping produced marked impairment. 


[Figure 4. Curve: linear rectification. Axes: θ in degrees, from 45 to −45 (horizontal); percent word articulation (vertical).]


FIG. 4. Effects of linear rectification upon communication
in noise: Both the talkers and the listeners were in simulated
airplane noise at an intensity of 115 db. The severity
of distortion is expressed in terms of the angle, θ, between
the two legs of the input-output characteristic. Forty-five 
degrees represents linear amplification, 0° represents half- 
wave rectification, and —45° represents full-wave rectifi- 
cation (cf. Fig. 2B). 


even in rather intense noise the reduction in 
intelligibility caused by peak clipping is quite 
small. 

The dashed curve in Fig. 3 indicates that the 
marked tolerance for peak clipping is matched 
by an equally marked intolerance for center
clipping. A very small amount of center clipping 
is apparently not detrimental, but any amount 
over 2 db reduces the scores considerably. The 
extremely marked effect is caused by the fact 
that center clipping strips out the weak consonant
sounds and leaves only the vowels, which
are less important than the consonants for
intelligibility. 
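
The way in which center clipping removes weak sounds can be sketched as follows. The dead-zone model, the name `center_clip`, and the consonant and vowel amplitudes are assumptions introduced for illustration; the dead zone is sized so that a peak at the reference amplitude is reduced by the stated number of decibels, in keeping with the definition given earlier.

```python
def center_clip(samples, clip_db, reference_peak):
    """Idealized center clipper (hypothetical dead-zone model).

    Samples inside +/-dead_zone are set to zero; samples outside are moved
    toward zero by dead_zone, so a peak at `reference_peak` is reduced by
    exactly `clip_db` decibels.
    """
    dead_zone = reference_peak * (1.0 - 10.0 ** (-clip_db / 20.0))
    out = []
    for x in samples:
        if x > dead_zone:
            out.append(x - dead_zone)
        elif x < -dead_zone:
            out.append(x + dead_zone)
        else:
            out.append(0.0)
    return out, dead_zone


# Hypothetical amplitudes: a weak consonant followed by a strong vowel.
consonant = [0.10, -0.15, 0.12, -0.08]
vowel = [1.00, -0.90, 0.95, -1.00]

clipped, dead_zone = center_clip(consonant + vowel, clip_db=2.0, reference_peak=1.0)
print(dead_zone)    # roughly one-fifth of the reference peak
print(clipped[:4])  # the consonant samples are all zero
```

Under this model, 2 db of center clipping gives a dead zone of about 21 percent of the reference peak, so any sound whose peaks lie more than about 14 db below the reference is removed entirely, while the vowel peaks are merely reduced.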

The noise in the tests just described was 
approximately as intense as that encountered in 
high speed fighter aircraft, and it imposed a 
considerable stress upon communication. With 
the close-talking microphone (MC-253-A) used 
in the tests, however, the signal-to-noise ratio at 
the input terminals of the peak clipper was 
approximately 20 db. It is of interest to compare 
with the curve for peak clipping shown in Fig. 3 
the results of tests conducted under the same 
conditions except that a microphone (Western 
Electric 640-A) was used which picked up more 
of the noise. The talkers spoke across the face of 
the microphone, maintaining constant spacing 
(approximately ¼ inch) between their lips and


[Figure 5. Axes: gain in speech channel in decibels, 0 db = threshold of audibility for undistorted speech in quiet (horizontal); percent word articulation (vertical).]


FIG. 5. Articulation vs. gain functions showing the effect
of peak clipping upon intelligibility when noise enters the 
system only at the listener’s end: The curves show that, 
if the gain is held constant, peak clipping reduces intelligi- 
bility. In the range in which the intelligibility of unclipped 
speech is marginal, the decrement is marked. The reduction
in the peak amplitude of the speech wave, however, is 
greater than the reduction in intelligibility. Each datum
point is based on 200 words, each large square on 1800
words. 


the center of the face of the microphone. Inas- 
much as the 640-A microphone provided excel- 
lent reproduction, the articulation scores were, 
in the absence of peak clipping, no lower than 
they were with the close-talking microphone. 
With 12-db peak clipping, however, the scores 
for the condenser microphone had fallen almost 
to 70 percent (cf. 80 percent with close-talking 
microphone), and with 24-db peak clipping only 
56 percent of the words were understood cor- 
rectly (cf. 70 percent with close-talking micro- 
phone). 

These results indicate that noise picked up by 
the microphone is very important in determining 
the effect of peak clipping upon intelligibility. 
The results of other tests indicate that the type 
of noise is as important as the intensity. Most 
types of noise, especially those with intense low 
frequency components and low peak factor, 
reduce the tolerance for peak clipping. But when 
impulsive noise with a high peak factor is mixed 
with the speech signal, peak clipping eliminates 
more noise than speech, and an improvement in 
intelligibility results. 
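
The favorable case of impulsive noise can be illustrated numerically. The sketch below is hypothetical throughout: a sampled tone stands in for speech, ten isolated pulses of ten times the speech peak stand in for the noise, and a signal-to-noise ratio computed from sample energies stands in for intelligibility, which it does not measure.

```python
import math


def clip(samples, limit):
    """Symmetrical abrupt clipper at +/-limit."""
    return [max(-limit, min(limit, x)) for x in samples]


def snr_db(clean, noisy):
    """Ratio of clean-signal energy to error energy, in decibels."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((n - s) ** 2 for s, n in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)


# A unit-amplitude tone stands in for the speech wave.
speech = [math.sin(2.0 * math.pi * k / 50.0) for k in range(1000)]

# Add sparse pulses whose amplitude is ten times the speech peak.
noisy = list(speech)
for k in range(7, 1000, 100):
    noisy[k] += 10.0

before = snr_db(speech, noisy)
after = snr_db(speech, clip(noisy, 1.0))
print(before, after)  # the ratio is much higher after clipping
```

Because the pulses rise far above the speech peaks, a clipper set at the speech peak removes most of the pulse energy while leaving the tone itself untouched.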

Figure 4 shows the effects of linear rectification 
upon the intelligibility of words heard over an 
interphone operating in simulated airplane noise 
at an intensity of 115 db. The close-talking 
microphone was used in these tests. Percent 
word articulation is plotted as a function of the 
angle θ, the angle between the variable leg of the
input-output characteristic and the horizontal 




axis. Intelligibility is little impaired by linear 
rectification until the distortion characteristic 
approaches half-wave rectification (θ = 0°). As
θ passes 0°, intelligibility falls off rapidly. In
airplane noise, speech is almost completely
garbled by full-wave rectification (θ = −45°).


CONCLUSIONS BASED ON TESTS IN QUIET AND 
IN AMBIENT NOISE 


On the basis of the results just described, 
together with other articulation data and sub- 
jective estimates of quality, the following conclu- 
sions seem justified: 

1. Amplitude distortion affects quality some- 
what more severely than it does intelligibility. 

2. Amplitude distortion is more detrimental 
when non-impulsive noise is mixed with the 
speech at a point ahead of the non-linear circuit 
than it is in quiet: intelligibility is irrevocably 
lost when speech and noise are allowed to inter- 
modulate. When the noise consists of sharp, 
high amplitude pulses, however, peak clipping 
eliminates the noise peaks and improves intelli- 
gibility. 

3. Amplitude distortion is somewhat less detri- 
mental, especially insofar as quality is concerned, 
when noise enters the system at a point following 
the non-linear circuit: noise added at the listener’s 
end of the system to speech already distorted 
tends to cover up some of the effects of distortion. 

4. When various types of amplitude distortion 
are equated in terms of any of the conventional 
methods of quantifying distortion, peak clipping 
appears to be inherently less detrimental than 
other types. 

5. Asymmetrical peak clipping, which affects 
the wave at a particular amplitude level on only 
one side of the time axis, is less detrimental than 
symmetrical peak clipping at the same amplitude 
level on both sides of the axis. 

6. The effects of amplitude distortion intro- 
duced by non-linear circuits with gradual over- 
load characteristics are quite similar to the 
effects of sharp peak clipping. The principal 
difference is that the high frequency distortion 
products are less intense with the circuits having 
curvilinear characteristics. 

7. In quiet, the tolerance for peak clipping is 
greater if the frequency-response characteristic 
of the circuits preceding the clipper rises approximately
6 db per octave than it is if there is no
frequency discrimination. When airplane noise 
is picked up by the microphone, however, a 
uniform frequency characteristic ahead of the 
clipper appears to be preferable to the 6 db per 
octave rise. 


FURTHER EXPERIMENTS WITH PEAK CLIPPING 


Inasmuch as symmetrical peak clipping ap- 
peared to reduce the peak amplitude of the 
speech wave much more than its intelligibility, 
further experiments were conducted to find out 
what might be gained by clipping off the peaks 
of the speech wave. In these tests, an abbreviated 
procedure, involving recorded word lists and a 
single listener, was employed. Tests were con- 
ducted with various amounts of clipping, with 
various types and amounts of noise, and with a 
wide range of speech-to-noise ratios. 

Figure 5 illustrates the type of curves obtained 
in these tests. The particular data shown were 
obtained with moderately intense noise in the 
listeners’ earphones to simulate a situation in 
which the talker is in quiet and the listener is in 
noise. Percent word articulation is plotted 
against the gain of the speech amplifier, with 
peak clipping as the parameter. 

The lateral displacement of the curves indi- 
cates that, since peak clipping removed part of 
the wave and reduced the effective level of the 
signal, the gain had to be turned up to make the 
clipped speech as intelligible as the unclipped 
speech. It must be noted, however, that with 
24-db peak clipping the peak amplitude was 
reduced 24 db, yet the gain had to be increased 
only about 13 to 15 db to restore the original 
level of intelligibility. In terms of intelligibility- 
per-unit-peak-amplitude, therefore, 24-db clip- 
ping netted an improvement in performance of 
between 9 and 11 db. 
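
The arithmetic behind the 9 to 11 db figure can be written out explicitly. The function name `net_advantage_db` is hypothetical; the numbers are those quoted in the paragraph above.

```python
def net_advantage_db(clip_db, gain_increase_db):
    """Peak-amplitude reduction minus the extra gain needed to offset it."""
    return clip_db - gain_increase_db


# 24-db clipping required 13 to 15 db of additional gain.
low = net_advantage_db(24.0, 15.0)
high = net_advantage_db(24.0, 13.0)
print(low, high)  # 9.0 11.0
```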

The advantage of peak clipping in a system of 
limited amplitude-handling capability is illus- 
trated in Fig. 6, in which the curves of Fig. 5 
have been replotted to show intelligibility as a 
function of the peak amplitude of the speech 
wave. This is the representation which is of 
principal interest if the speech is transmitted via 
modulated carrier or if, for any other reason, 
the peak-amplitude-handling capacity is the 
parameter which limits the performance of the 




communication system. Figure 6 indicates that 
performance is better with peak clipping than 
without throughout the range in which intelligi- 
bility is marginal. The advantage provided by 
24-db clipping—expressed in terms of the hori- 
zontal displacement of the 24-db curve with 
respect to the curve for 0-db clipping—is as great 
as 11 db. With 12-db clipping, the saving in peak 
power required for 80 percent word articulation 
is 8 db. It is clear that amounts of peak clipping 
greater than 12 or 15 db yield diminishing returns. 

These results, supported by data from tests in 
quiet and from tests with more intense noise in 
the earphones, provided the basis for a conclusion 
which takes the form of a general principle: If a 
communication system has insufficient ampli- 
tude-handling capability to pass the peaks of 
speech and at the same time to provide an 
adequate intensity level, maximal intelligibility 
is obtained by clipping off the peaks and using 
the available power for the remainder of the 
wave. This principle has been studied extensively 
in two applications: one, in connection with the 
design of radio transmitters, in which the ampli- 
tude-handling capability of the system is limited 
at 100 percent modulation of the carrier³; the
other, in connection with the design of hearing 
aids, in which the amplitude-handling capability 
of the system is limited by the listener’s threshold 


[Figure 6. Axes: peak amplitude of speech wave in decibels re 0.0002 dyne/cm² (horizontal); percent word articulation (vertical).]


FIG. 6. Curves showing that clipped speech is more
intelligible than unclipped speech if the waves are equated 
in terms of peak instantaneous amplitude: These curves 
were replotted from the data of Fig. 5. The improvement 
provided by clipping can be expressed in terms of the 
separation of the curves. If attention is focussed upon the 
horizontal separation at the level of 80-percent word 
articulation, the advantage can be estimated as equivalent, 
for example, to an 11-db saving in carrier power in a radio 
transmitter. 


³ K. D. Kryter, Psycho-Acoustic Laboratory Report
IC-83 (10 October 1944). J. C. R. Licklider and G. A. 
Roberts, Psycho-Acoustic Laboratory Report IC-100 (30 
June 1945). 


[Figure 7. Curves: with limiter in receiver (solid), without limiter (dashed). Axes: RF carrier-to-interference ratio in decibels (horizontal); percent word articulation (vertical).]


FIG. 7. Improvement in radio communication provided
by the noise-limiting action of a clipper circuit in the audio 
section of the receiver: The solid curve was obtained with 
the clipper in operation, the dashed curve with the clipper 
disconnected. The interference consisted of very sharp 
electrical pulses, occurring at irregular intervals. The 
pulses, at an average repetition rate of 1000 per second, 
provided a realistic simulation of thunderstorm static. 
This application of peak clipping in radio receivers is 
fundamentally different from the application in radio 
transmitters. In the receiver, clipping serves to discriminate 
between speech and impulse noise. In the transmitter, 
clipping provides a means of “packaging” speech for
efficient transmission via carrier. 


of tickle or of discomfort. The principle appears 
to be of wide applicability because a considerable 
improvement in intelligibility can often be ob- 
tained with as little as 12-db peak clipping, and 
that amount does not impair speech quality to 
any great degree. 

In estimating the improvement in performance 
to be expected from the employment of peak 
clipping and additional amplification in a com- 
munication system of limited peak power capa- 
bility, it is important to examine the reference 
condition carefully. In the tests described above, 
the reference was 0-db peak clipping. Zero-db 
clipping was defined for those tests as clipping 
which limits the peaks of the test words which 
are greater than average in amplitude and leaves 
the other words entirely unaffected. Thus the 
critical amplitude for 0-db clipping was the 
average of the peak amplitudes of the words in 
each test list. Inasmuch as the peak amplitude 
of the most intense word was usually about 5 db 
higher than the average, 0-db clipping in the 
notation used here would correspond to approxi- 
mately 5-db clipping in a notation in which the 
amplitude of the highest peak was taken as the 
critical amplitude. With such a notation, the 


⁴ Hallowell Davis, et al., “The selection of hearing aids,”
The Laryngoscope, 56, No. 3, 85-115 (March, 1946); and 
No. 4, 135-163 (April, 1946). 




improvement provided by peak clipping would 
appear approximately 5 db greater than is indicated
herein. With sentences, the variation in
peak amplitude from one part of the speech 
material to another is even greater than it is 
with the words used in the articulation tests. 
Thus the actual increase in intelligibility which 
would result from the use of premodulation 
clipping in a radio transmitter might be either 
considerably greater or considerably less than 
the estimate based on the present tests: It would 
be greater if normally the modulation factor 
were kept low in order to avoid overmodulation 
and “splatter”; it would be less if normally a
considerable amount of overmodulation were 
allowed to occur. 
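
The dependence of the stated clipping level on the reference condition can be made concrete with a short sketch. Everything specific in it is hypothetical: the function name, the list of word peaks, and the choice to average the peaks as linear amplitudes rather than in decibels, which the text does not settle.

```python
import math


def clip_db_re_highest_peak(word_peaks, clip_db_re_average):
    """Re-express a clipping level relative to the highest word peak.

    The paper's notation references the average of the word peaks; taking
    that average over linear amplitudes is an assumption made here.
    """
    average_peak = sum(word_peaks) / len(word_peaks)
    threshold = average_peak * 10.0 ** (-clip_db_re_average / 20.0)
    return 20.0 * math.log10(max(word_peaks) / threshold)


# Hypothetical peak amplitudes for the words of one test list.
peaks = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.4, 1.8]

offset = clip_db_re_highest_peak(peaks, 0.0)    # about 5 db for these values
shifted = clip_db_re_highest_peak(peaks, 12.0)  # 12 db more than `offset`
print(offset, shifted)
```

For this invented list the highest peak lies about 5 db above the average, so every clipping level quoted relative to the average reads about 5 db higher when quoted relative to the highest peak.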

Finally, a quite different application of peak 
clipping is of interest. Some of the noise-peak 
limiters used in radio receivers to reduce the 
disturbing effects of impulsive radio interference 
are essentially peak clippers. Figure 7 shows the 
improvement of performance which was obtained 
by introducing a peak clipper into the audio 
section of an AM receiver operating against 
interference which consisted of 1000 randomly- 
spaced pulses per second. Usable communication 
(50 percent word articulation) was obtained with 
a carrier 20 db weaker than that required when 
no limiter was employed. The improvement in 
performance is caused, of course, by the fact 
that the non-linear circuit discriminates against 
noise and in favor of speech on the basis of the 
characteristic difference between the noise and 
speech wave forms. This application of clipping 
or limiting circuits has been studied extensively. 
It has been found that, under certain conditions, 
the advantage provided by noise limiting in the 
receiver is even greater than that afforded by 
premodulation clipping in the transmitter. 

The demonstration that peak clipping often 
has practical advantages makes it necessary to 
regard the problem of amplitude distortion from 
a somewhat altered viewpoint, especially insofar 
as the specification of permissible limits is con- 
cerned. It is clear that, for some purposes, design 
objectives and specifications must be written in 
such a way as to allow, or even to require, a 
considerable amount of peak clipping without, 
at the same time, removing the bar against other 
undesirable types of amplitude distortion. 


